sapphirine R PackageSAPPHIRINE: Sensor-based Analysis of Pollution in the Philadelphia Region with Information on Neighborhoods and the Environment
SAPPHIRINE integrates pollution and geospatial data relevant to investigators, citizen scientists, and policy makers in the Greater Philadelphia Area. SAPPHIRINE’s capabilities include providing interactive maps and customizable data retrieval to aid in the visual identification of pollution and other factor hotspots as well as hypothesis generation regarding relationships among these factors and health outcomes. Data for pollution originate from AirCasting and PurpleAir crowdsourced databases (http://aircasting.org/mobile_map, https://www.purpleair.com/sensorlist), and data for crime, Area Deprivation Index, and traffic originate from OpenDataPhilly (https://www.opendataphilly.org/dataset/crime-incidents), Neighborhood Atlas (https://www.neighborhoodatlas.medicine.wisc.edu/), and PennDOT (https://data-pennshare.opendata.arcgis.com/datasets/rmstraffic-traffic-volumes/data), respectively. Data for pollution were also colected with AirBeam sensors in a pilot study by the Himes Lab, funded by CEET (http://ceet.upenn.edu/). EPA pollution data were downloaded from the EPA Air Data Portal (https://aqs.epa.gov/aqsweb/airdata/download_files.html).
The sapphirine package comes with 2 comprehensive data sets: local.data, which includes sensor-based data for temperature, humidity, PM1, PM2.5, and PM10 as well as crime data; and EPA.data, which includes data for PM2.5, PM10, SO2, NO2, O3, and CO measured at EPA monitoring stations. Data for Traffic and Area Deprivation Index are pre-rendered in raster format and can be accessed from traffic.raster and ADI_data, respectively.
Install the prerequisite R packages if they do not exist:
library(devtools)
if('sapphirine' !%in% installed.packages()){
devtools::install_github("HimesGroup/sapphirine", subdir = "pkg/sapphirine")
}
Load package:
library(sapphirine)
Select subset of of Greater Philadelphia Area counties to generate a custom shapefile for data subsetting and raster generation. Call sapphirine::GPACountyNames for a list of county names.
counties <- c('Bucks', 'Chester', 'Delaware', 'Montgomery', 'Philadelphia')
cty.shape <- selectGPACounties(counties)
raster::plot(cty.shape)
Subset local.data according to custom parameters. Here, we include data within the counties selected above and measured between 1 June 2017 - 31 May 2019 and between 16:00-18:00 U.S. Eastern Time each day.
my.local.data <- customLocalData(cty.shape, '2017-06-01', '2019-05-31', '16:00', '18:00')
head(data.frame(my.local.data))
## Timestamp Latitude Longitude Sensor.ID Temperature
## 1 2017-11-14 17:03:30 39.94724 -75.39016 AirBeam:00189610619E 15
## 2 2017-11-14 17:03:40 39.94723 -75.39022 AirBeam:00189610619E 15
## 3 2017-11-14 17:03:50 39.94723 -75.39024 AirBeam:00189610619E 15
## 4 2017-11-14 17:04:00 39.94730 -75.39030 AirBeam:00189610619E 15
## 5 2017-11-14 17:04:10 39.94732 -75.39032 AirBeam:00189610619E 15
## 6 2017-11-14 17:04:20 39.94732 -75.39032 AirBeam:00189610619E 15
## Humidity PM1 PM2.5 PM10 Crime Count
## 1 43.0 NA 10.560 NA NA 1
## 2 42.8 NA 9.396 NA NA 1
## 3 42.0 NA 8.283 NA NA 1
## 4 42.0 NA 7.848 NA NA 1
## 5 41.8 NA 7.166 NA NA 1
## 6 41.0 NA 6.728 NA NA 1
Create a brick of raster layers corresponding to local.data variables. Here, we use the local dataset and shapefile created above and use 100x100 resolution.
local.ras <- localRaster(my.local.data, cty.shape, 100, 100)
methods::show(local.ras)
## class : RasterBrick
## dimensions : 100, 100, 10000, 13 (nrow, ncol, ncell, nlayers)
## resolution : 0.01414947, 0.00887024 (x, y)
## extent : -76.13645, -74.7215, 39.72156, 40.60858 (xmin, xmax, ymin, ymax)
## crs : +proj=longlat +datum=WGS84 +no_defs
## source : memory
## names : Temperature, Humidity, PM1, PM2.5, PM10, Crime, Temperature.log.density., Humidity.log.density., PM1.log.density., PM2.5.log.density., PM10.log.density., Poverty, Traffic
## min values : 9.885333, 18.841250, 0.000000, 0.345000, 0.252500, 1.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 2.000000, 73.000000
## max values : 38.350000, 82.879531, 10.311563, 68.921215, 21.718763, 1487.000000, 3.723948, 3.723948, 3.725340, 3.725667, 3.725993, 100.000000, 72822.790747
Render a leaflet map display for a brick of local data rasters. Here, we create a map display for local.ras and project the boundaries of the counties selected above.
my.local.map <- localMap(local.ras, bounds = cty.shape)
tagList(my.local.map)
Subset the EPA.data according to custom parameters. Here, we include data within the counties selected above and measured between 1 June 2017 - 31 May 2019.
my.epa.data <- customEPAData(cty.shape, '2017-06-01', '2019-05-31')
head(data.frame(my.epa.data))
## Date Longitude Latitude AQS_Site_ID Local.Site.Name State.Name
## 1 2017-06-01 -75.08083 39.99139 42-101-0048 North East Waste (NEW) Pennsylvania
## 2 2017-06-01 -75.08083 39.99139 42-101-0048 North East Waste (NEW) Pennsylvania
## 3 2017-06-01 -75.08083 39.99139 42-101-0048 North East Waste (NEW) Pennsylvania
## 4 2017-06-01 -75.08083 39.99139 42-101-0048 North East Waste (NEW) Pennsylvania
## 5 2017-06-02 -75.08083 39.99139 42-101-0048 North East Waste (NEW) Pennsylvania
## 6 2017-06-02 -75.08083 39.99139 42-101-0048 North East Waste (NEW) Pennsylvania
## City.Name Monitor_Start_Date Last_Sample_Date PM2.5.ug.m.3. PM10.ug.m.3.
## 1 Philadelphia 2013-10-01 2020-09-30 7.637500 NA
## 2 Philadelphia 2013-10-01 2020-09-30 7.600000 NA
## 3 Philadelphia 2013-10-01 2020-09-30 7.637500 NA
## 4 Philadelphia 2013-10-01 2020-09-30 7.600000 NA
## 5 Philadelphia 2013-10-01 2020-09-30 7.066667 NA
## 6 Philadelphia 2013-10-01 2020-09-30 7.000000 NA
## SO2.ppb. NO2.ppb. O3.ppb. CO.ppb.
## 1 1.140000 NA 41.059 0.3
## 2 1.140000 NA 41.059 0.3
## 3 1.116667 NA 41.059 0.3
## 4 1.116667 NA 41.059 0.3
## 5 1.225000 NA 29.118 0.3
## 6 1.225000 NA 29.118 0.3
Create a brick of 1km x 1km raster layers rendered with inverse-distance-weighted interpolation of EPA data. Here, we use the EPA dataset and shapefile created above.
epa.ras <- intEPARaster(my.epa.data, cty.shape)
methods::show(epa.ras)
## class : EPARasterBrick
## dimensions : 98, 157, 15386, 6 (nrow, ncol, ncell, nlayers)
## resolution : 0.00901466, 0.009013171 (x, y)
## extent : -76.13873, -74.72342, 39.72199, 40.60528 (xmin, xmax, ymin, ymax)
## crs : +proj=longlat +datum=WGS84 +no_defs
## source : memory
## names : PM2.5, PM10, SO2, NO2, O3, CO
## min values : 7.9109829, 18.3353028, 0.3246053, 8.7085803, 25.0908216, 0.2633197
## max values : 10.6409376, 18.3353028, 0.9282857, 11.2932587, 30.7403024, 0.4161934
Render a leaflet map display for a brick of interpolated EPA data rasters. Here, we create a map display for epa.ras and project the boundaries of the counties selected above.
my.epa.map <- EPAMap(epa.ras, bounds = cty.shape)
tagList(my.epa.map)